Custom Tool Development Guide
This guide explains how to develop custom tools within the Agentic Browser framework. It covers the tool creation lifecycle: schema definition, function implementation, registration, testing, validation, and debugging. It also documents best practices for naming, error handling, documentation, performance, security, distribution, and maintenance.
The framework organizes tools under a dedicated tools/ namespace, integrates them into an agent tool registry, and exposes them as LangChain StructuredTool instances. Supporting services, prompts, and sanitization utilities handle prompting, validation, and safety checks at runtime.
StructuredTool wrappers: Tools are defined as LangChain StructuredTool instances with typed Pydantic schemas and coroutine implementations.
Tool schemas: Pydantic models define input fields, constraints, and descriptions for validation and documentation.
Tool implementations: Async coroutines orchestrate service calls, external APIs, or scraping utilities.
Tool registry: A builder function composes tools dynamically, optionally injecting credentials or context.
Service layer: Dedicated services encapsulate LLM prompting, sanitization, and domain-specific logic.
Validation and sanitization: Utilities enforce safe JSON action plans and guardrails for scripts.
The tool development architecture follows a layered pattern:
Tool layer: Defines inputs and async logic.
Agent layer: Composes tools and injects context.
Service layer: Encapsulates LLM prompts, sanitization, and domain operations.
Extension layer: Executes browser actions and interacts with the page.
Tool Creation Pattern#
Follow this repeatable pattern to implement a new tool:
Define a Pydantic input schema with Field constraints and descriptions.
Implement an async coroutine that validates inputs, orchestrates work, and returns a normalized result.
Wrap the coroutine in a StructuredTool with a descriptive name and schema.
Register the tool in the agent builder function.
Flow diagram: Orchestrate (Service/API/Scraper), then Return Normalized Result, then Wrap in StructuredTool, then Register in Agent Builder, then Ready.
Schema Definition Best Practices#
Use Pydantic Field constraints (min_length, ge, le) and validated types (HttpUrl, EmailStr) to reject invalid input early.
Provide clear descriptions for each field to aid LLM reasoning.
Prefer optional fields with sensible defaults when appropriate.
Reuse shared schemas across related tools to maintain consistency.
Examples of schema patterns:
URL and question-based tools: WebsiteToolInput, YouTubeToolInput
OAuth-enabled tools: GmailToolInput, CalendarToolInput
Action-focused tools: BrowserActionInput
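As an illustration of these practices, the sketch below defines a URL-and-question schema. `PageQuestionInput` and its fields are hypothetical and are not the actual definition of `WebsiteToolInput` or any other schema named above.

```python
from typing import Optional

from pydantic import BaseModel, Field, HttpUrl, ValidationError


class PageQuestionInput(BaseModel):
    """Hypothetical schema for a tool that answers a question about a page."""

    url: HttpUrl = Field(..., description="Absolute URL of the page to read.")
    question: str = Field(..., min_length=3, description="Question to answer about the page.")
    max_results: int = Field(5, ge=1, le=20, description="Maximum number of passages to return.")
    language: Optional[str] = Field(None, description="Optional language code for the answer.")


def try_parse(**kwargs) -> Optional[PageQuestionInput]:
    """Return a validated model, or None when any constraint is violated."""
    try:
        return PageQuestionInput(**kwargs)
    except ValidationError:
        return None
```

Invalid values such as `max_results=0` or a malformed URL are rejected before the tool body runs, so the coroutine can assume well-formed input.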
Function Implementation Patterns#
Use asyncio.to_thread for blocking operations to avoid blocking the event loop.
Normalize outputs to strings or structured JSON for downstream consumers.
Handle missing credentials gracefully and return actionable error messages.
Apply bounds checking for numeric parameters.
Examples:
Web search pipeline: web_search_pipeline
Website markdown fetcher: clean_response
YouTube info extraction: get_video_info
Gmail send: send_email
Calendar event creation: create_calendar_event
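The following sketch combines the four implementation patterns in one coroutine. It is an assumption-laden example: `_blocking_search` stands in for a real blocking client, and `search_tool` and its parameters are hypothetical rather than the signature of `web_search_pipeline`.

```python
import asyncio
import json
import time
from typing import List, Optional


def _blocking_search(query: str, max_results: int) -> List[str]:
    """Stand-in for a blocking HTTP or SDK call (hypothetical)."""
    time.sleep(0.01)  # simulate blocking I/O
    return [f"result {i} for {query}" for i in range(1, max_results + 1)]


async def search_tool(query: str, max_results: int = 5,
                      access_token: Optional[str] = None) -> str:
    # Handle missing credentials with an actionable message instead of raising.
    if not access_token:
        return json.dumps({"ok": False,
                           "error": "Missing access token. Sign in and retry."})

    # Bounds-check numeric parameters (the range 1..10 is an assumed limit).
    max_results = max(1, min(max_results, 10))

    # Run the blocking call in a worker thread so the event loop stays free.
    results = await asyncio.to_thread(_blocking_search, query, max_results)

    # Normalize the output to a JSON string for downstream consumers.
    return json.dumps({"ok": True, "results": results})
```

`asyncio.to_thread` requires Python 3.9 or later; on older interpreters `loop.run_in_executor` serves the same purpose.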
Registration and Dynamic Composition#
Tools are registered via a builder that accepts context (e.g., tokens, session payloads).
Partial functions inject default credentials to avoid requiring users to pass tokens every time.
Conditional tools are added only when credentials are present.
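A builder along these lines could look like the sketch below. To stay dependency-free it returns plain coroutines keyed by name; a real builder would presumably return StructuredTool instances. The context key `google_access_token` and the helper functions are hypothetical.

```python
import asyncio
from functools import partial
from typing import Any, Awaitable, Callable, Dict, Optional

ToolFn = Callable[..., Awaitable[str]]


async def _search(query: str) -> str:
    return f"results for {query}"


async def _send_email(to: str, subject: str, body: str,
                      access_token: Optional[str] = None) -> str:
    if not access_token:
        return "Error: no access token available."
    # The token is used for the call but never echoed in the result.
    return f"queued email to {to}"


def build_tools(context: Optional[Dict[str, Any]] = None) -> Dict[str, ToolFn]:
    """Compose the tool set for one session from the supplied context."""
    context = context or {}
    tools: Dict[str, ToolFn] = {"websearch_agent": _search}

    token = context.get("google_access_token")  # assumed context key
    if token:
        # Inject the credential once so callers never pass it explicitly.
        tools["gmail_send_agent"] = partial(_send_email, access_token=token)
    return tools
```

With no token in the context, the Gmail tool is simply absent, so the agent cannot select a tool that is bound to fail.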
Browser Action Tool: End-to-End Flow#
The browser action tool demonstrates the full lifecycle: schema, service invocation, sanitization, and structured output.
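The sanitization stage of that lifecycle can be approximated as below. The action vocabulary and required fields are assumptions made for illustration (only EXECUTE_SCRIPT is named in this guide), and `validate_action_plan` is a hypothetical helper, not the framework's sanitizer.

```python
import json
from typing import Any, Dict, List

# Assumed action vocabulary and required fields per action type.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "CLICK": ["selector"],
    "TYPE": ["selector", "text"],
    "NAVIGATE": ["url"],
    "EXECUTE_SCRIPT": ["script"],
}


def validate_action_plan(raw: str) -> Dict[str, Any]:
    """Parse an LLM-produced plan and return a structured verdict."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "errors": [f"Invalid JSON: {exc.msg}"], "actions": []}

    actions = plan.get("actions") if isinstance(plan, dict) else None
    if not isinstance(actions, list):
        return {"ok": False,
                "errors": ["Plan must be an object with an 'actions' list."],
                "actions": []}

    errors: List[str] = []
    for i, action in enumerate(actions):
        if not isinstance(action, dict):
            errors.append(f"Action {i}: must be an object.")
            continue
        kind = action.get("type")
        if not isinstance(kind, str) or kind not in REQUIRED_FIELDS:
            errors.append(f"Action {i}: unknown type {kind!r}.")
            continue
        missing = [f for f in REQUIRED_FIELDS[kind] if not action.get(f)]
        if missing:
            errors.append(f"Action {i}: missing {', '.join(missing)}.")

    # Only hand actions onward when the whole plan is valid.
    return {"ok": not errors, "errors": errors, "actions": [] if errors else actions}
```

Rejecting the whole plan when any action is invalid is a deliberate choice in this sketch: partially executing a plan can leave the page in a state the agent did not intend.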
Extension Integration for Browser Actions#
The extension executes browser actions based on tool commands. It logs and routes tool types to specific handlers, returning structured results or errors.
Tool schemas depend on Pydantic for validation.
Tools depend on LangChain StructuredTool for registration.
Agent builder composes tools and injects context.
Services depend on prompts and sanitization utilities.
Extension depends on browser APIs for action execution.
Use asyncio.to_thread for blocking I/O to prevent event loop stalls.
Bound numeric inputs (e.g., max_results) to reasonable ranges.
Limit DOM previews and interactive element listings to keep prompt token usage low.
Cache or reuse expensive computations when feasible.
Prefer lightweight scrapers and minimize network calls.
Common issues and resolutions:
Invalid JSON action plans: The sanitizer enforces required fields and safe patterns. Review validation messages and adjust tool outputs accordingly.
Missing credentials: Tools return explicit error messages when tokens are absent; ensure context injection via the agent builder.
Unexpected API responses: Wrap external calls in try/except blocks and normalize error messages.
Extension errors: The extension logs tool types and errors; check the console for actionable diagnostics.
By following the schema-first, service-backed, and sanitized tool creation pattern, you can reliably add new capabilities to the Agentic Browser. Use the agent builder to register tools, leverage the extension for browser actions, and apply the sanitization utilities to ensure safety and reliability.
Step-by-Step: Implementing a New Tool#
Define a Pydantic input schema with Field constraints.
Implement an async coroutine that validates inputs, performs work, and normalizes output.
Wrap the coroutine in a StructuredTool with a descriptive name and schema.
Add the tool to the agent builder and inject defaults via context when applicable.
Test with representative inputs and edge cases.
Validate outputs with the sanitizer and extension integration.
Testing Strategies#
Unit tests for blocking operations: Mock external APIs and assert normalized outputs.
Integration tests: Use the agent builder with mock context to exercise tool composition.
Sanitization tests: Provide malformed JSON and invalid action types to verify robustness.
End-to-end tests: Execute browser actions through the extension and verify results.
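A unit test in this style might look like the sketch below. `fetch_page` and `website_tool` are hypothetical; the fetcher is passed in as a parameter so the test can substitute a mock without patching module paths.

```python
import asyncio
import unittest
from typing import Callable
from unittest.mock import Mock


def fetch_page(url: str) -> str:
    """Placeholder for a real blocking fetcher; never called in these tests."""
    raise RuntimeError("network access is not available in unit tests")


async def website_tool(url: str, fetch: Callable[[str], str] = fetch_page) -> str:
    try:
        body = await asyncio.to_thread(fetch, url)
    except Exception as exc:
        return f"Error: could not fetch {url}: {exc}"
    return body.strip()  # normalized output


class WebsiteToolTests(unittest.IsolatedAsyncioTestCase):
    async def test_normalizes_output(self):
        fake = Mock(return_value="  hello  ")
        self.assertEqual(await website_tool("https://example.com", fetch=fake), "hello")
        fake.assert_called_once_with("https://example.com")

    async def test_reports_failure_as_message(self):
        fake = Mock(side_effect=TimeoutError("timed out"))
        out = await website_tool("https://example.com", fetch=fake)
        self.assertTrue(out.startswith("Error:"))
        self.assertIn("timed out", out)
```

Saved as a test module, this runs with `python -m unittest`; the same mocks can back an integration test that passes a mock context to the agent builder.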
Validation Approaches#
Pydantic schema validation for inputs.
JSON action plan validation and safety checks.
Error wrapping and user-friendly messages.
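For the first and third approaches, one possible way to turn a Pydantic validation failure into a user-friendly message is shown below. `SearchInput` and `parse_input` are hypothetical names, and the exact message text depends on the installed Pydantic version.

```python
from typing import Any, Dict, Optional, Tuple

from pydantic import BaseModel, Field, ValidationError


class SearchInput(BaseModel):
    query: str = Field(..., min_length=1, description="Search query text.")
    max_results: int = Field(5, ge=1, le=10, description="Number of results to return.")


def parse_input(raw: Dict[str, Any]) -> Tuple[Optional[SearchInput], Optional[str]]:
    """Return (model, None) on success or (None, message) on failure."""
    try:
        return SearchInput(**raw), None
    except ValidationError as exc:
        # Each error carries the field location and a human-readable message.
        problems = [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
        return None, "Invalid input: " + "; ".join(problems)
```

Returning the message instead of raising lets the tool hand the agent something it can act on, such as retrying with a smaller `max_results`.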
Debugging Techniques#
Log inputs and outputs at each stage.
Use try/except around external calls and return structured error payloads.
Inspect extension logs for tool routing and execution outcomes.
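The first two techniques can be combined in a small decorator, sketched here under the assumption that tools are called with keyword arguments. `traced`, the logger name, and `demo` are hypothetical.

```python
import asyncio
import functools
import json
import logging

logger = logging.getLogger("tools")  # assumed logger name


def traced(tool_name: str):
    """Log a tool's inputs and outputs and convert exceptions to error payloads."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(**kwargs):
            # Log argument names only, so secrets in values are not recorded.
            logger.info("%s called with keys=%s", tool_name, sorted(kwargs))
            try:
                result = await fn(**kwargs)
            except Exception as exc:
                logger.exception("%s failed", tool_name)
                return json.dumps({"ok": False, "tool": tool_name,
                                   "error": f"{type(exc).__name__}: {exc}"})
            logger.info("%s returned %d chars", tool_name, len(str(result)))
            return result
        return wrapper
    return decorator


@traced("demo_agent")
async def demo(url: str) -> str:
    if not url.startswith("https://"):
        raise ValueError("only https URLs are supported")
    return json.dumps({"ok": True, "url": url})
```

Calling `logging.basicConfig(level=logging.INFO)` once at startup makes these records visible during a debugging session.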
Best Practices#
Naming: Use descriptive, consistent names (e.g., verb_noun).
Error handling: Fail fast with clear messages; avoid leaking secrets.
Documentation: Include field descriptions and constraints in schemas.
Security: Avoid unsafe script patterns; sanitize inputs and limit permissions.
Performance: Minimize blocking calls; bound resource usage.
Templates and Examples#
Browser action tool: browser_action_agent
Web search tool: websearch_agent
Website tool: website_agent
YouTube tool: youtube_agent
Gmail tools: gmail_agent, gmail_send_agent, gmail_list_unread_agent, gmail_mark_read_agent
Calendar tools: calendar_agent, calendar_create_event_agent
PyJIIT tool: pyjiit_agent
Security Considerations#
Avoid dangerous script patterns in EXECUTE_SCRIPT actions.
Validate and constrain inputs to prevent injection.
Limit tool capabilities to least privilege.
Sanitize log output and record only non-sensitive information.
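As a rough illustration of the first and last points, the sketch below flags a few script patterns and redacts sensitive keys before logging. The deny-list and key names are assumptions, not the framework's actual rules, and a deny-list alone is not a complete defense.

```python
import re
from typing import Any, Dict, List

# Illustrative deny-list for EXECUTE_SCRIPT payloads (assumed patterns).
DANGEROUS_PATTERNS: List[str] = [
    r"\beval\s*\(",
    r"\bnew\s+Function\b",
    r"document\.cookie",
    r"\blocalStorage\b",
]

# Assumed names of fields that must never reach the logs.
SENSITIVE_KEYS = {"access_token", "refresh_token", "password", "authorization"}


def find_unsafe_patterns(script: str) -> List[str]:
    """Return every deny-listed pattern that occurs in the script."""
    return [p for p in DANGEROUS_PATTERNS if re.search(p, script)]


def redact(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a payload with sensitive values masked, for safe logging."""
    return {
        key: ("***" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in payload.items()
    }
```

A script is rejected when `find_unsafe_patterns` returns a non-empty list; where the set of needed scripts is small, an allow-list of known-safe snippets is the stricter option.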
Distribution and Maintenance#
Package tools under tools/ with clear module boundaries.
Keep schemas and tool names stable; deprecate gradually.
Provide CLI entry points for manual testing where applicable.
Maintain changelogs and update agent builder when adding/removing tools.
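A CLI entry point for manual testing can be as small as the sketch below. `echo_tool` is a hypothetical stand-in for a real tool coroutine, and the argument names are assumptions.

```python
import argparse
import asyncio
import json
from typing import List, Optional


async def echo_tool(text: str, repeat: int = 1) -> str:
    """Stand-in tool coroutine that returns a normalized JSON string."""
    return json.dumps({"ok": True, "output": " ".join([text] * repeat)})


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Run a tool manually for testing.")
    parser.add_argument("text", help="Input text for the tool.")
    parser.add_argument("--repeat", type=int, default=1, help="Times to repeat the text.")
    args = parser.parse_args(argv)

    # Drive the async tool from a synchronous entry point.
    print(asyncio.run(echo_tool(args.text, args.repeat)))
    return 0
```

Adding an `if __name__ == "__main__": raise SystemExit(main())` guard to the module makes it runnable with `python -m`, and accepting `argv` as a parameter keeps `main` callable from tests.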